Optimized serving endpoint for classical ML models #3377
Co-authored-by: Miłosz Żeglarski <[email protected]>
@@ -0,0 +1,2 @@
SepalLength,SepalWidth,PetalLength,PetalWidth
I guess we don't need that in the repo since we have an instruction on preparing the data.
@@ -0,0 +1,508 @@
SepalLength,SepalWidth,PetalLength,PetalWidth,Species
I guess we don't need that in the repo since we have an instruction on preparing the data.
output_dir = sys.argv[1]

#os.makedirs(output_dir, exist_ok=True)
Let's remove the comment.
Pull Request Overview
This PR introduces a complete machine learning pipeline project for the Iris dataset using OpenVINO Model Server (OVMS) with MediaPipe. The project supports both logistic regression and k-means clustering models with training and inference capabilities through client scripts.
Key changes:
- Complete pipeline implementation with PyTorch logistic regression and scikit-learn k-means models
- Client scripts for training and inference with flexible parameter handling
- Docker-based deployment with Intel optimization libraries
- Data preprocessing utilities and example datasets
Reviewed Changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 5 comments.
File | Description |
---|---|
extras/iris_pipeline_project/pipeline/ovmsmodel.py | Core OVMS Python model implementing training/inference logic for both model types |
extras/iris_pipeline_project/pipeline/model.py | Abstract model classes with PyTorch logistic regression and scikit-learn k-means implementations |
extras/iris_pipeline_project/pipeline/graph.pbtxt | MediaPipe graph configuration for the pipeline |
extras/iris_pipeline_project/model_config.json | OVMS model configuration |
extras/iris_pipeline_project/labelmap.json | Label mapping for Iris species |
extras/iris_pipeline_project/kmeans_param.json | K-means hyperparameters |
extras/iris_pipeline_project/hyperparams.json | Logistic regression hyperparameters |
extras/iris_pipeline_project/data_preprocess.py | Data preprocessing script for Iris dataset |
extras/iris_pipeline_project/data_folder/iris_train.csv | Training dataset |
extras/iris_pipeline_project/data_folder/iris_test.csv | Test dataset |
extras/iris_pipeline_project/client/client_train.py | Client script for training models |
extras/iris_pipeline_project/client/client_inference.py | Client script for inference |
extras/iris_pipeline_project/README.md | Project documentation |
extras/iris_pipeline_project/Dockerfile | Docker configuration with Intel optimizations |
extras/iris_pipeline_project/.gitignore | Git ignore file |
import sys

if len(sys.argv) < 2:
    print("Usage: python datapreprocess.py <output_directory>")
The filename in the usage message should match the actual filename: 'data_preprocess.py' instead of 'datapreprocess.py'.
- print("Usage: python datapreprocess.py <output_directory>")
+ print("Usage: python data_preprocess.py <output_directory>")
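As a side note to the suggestion above, one way to keep the usage message from drifting out of sync with the filename is to derive it from the path the script was invoked with. This is only a sketch: `usage_message` and `main` are hypothetical helpers, not code from this PR.

```python
import os


def usage_message(script_path: str) -> str:
    """Build the usage line from the path the script was invoked with."""
    return f"Usage: python {os.path.basename(script_path)} <output_directory>"


def main(argv: list) -> int:
    """Return a process exit code: 1 if the output directory is missing."""
    if len(argv) < 2:
        print(usage_message(argv[0]))
        return 1
    output_dir = argv[1]
    print(f"Writing output to {output_dir}")
    return 0


# Example: calling the script without arguments prints the usage line.
main(["extras/iris_pipeline_project/data_preprocess.py"])
```

In the real script this would be called as `sys.exit(main(sys.argv))`.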
Co-authored-by: Copilot <[email protected]>
For Enabling accelerator support:
Manually set the ```bool - (use_ipex/use_oneDAL)``` in model.py file under "pipeline" directory to either True/False depending on the necessity.
We needed that for benchmarks. In the guide, we should not require the user to modify the code.
Either remove it completely and use acceleration by default, or pass it in the hyperparameters JSON.
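A minimal sketch of the second option, assuming the flags travel inside the hyperparameters JSON. The key names `use_ipex` and `use_oneDAL` mirror the variables mentioned in the README hunk. Their placement in the JSON, the defaults, and the helper `read_acceleration_flags` are assumptions for illustration, not part of the PR.

```python
import json

# Assumed defaults; the PR does not say whether acceleration is on or off by default.
DEFAULT_ACCELERATION = {"use_ipex": False, "use_oneDAL": False}


def read_acceleration_flags(hyperparams_json: str) -> dict:
    """Extract acceleration flags from a hyperparameters JSON string.

    Missing keys fall back to DEFAULT_ACCELERATION; unrelated keys are ignored.
    """
    params = json.loads(hyperparams_json) if hyperparams_json else {}
    flags = dict(DEFAULT_ACCELERATION)
    for key in flags:
        if key in params:
            flags[key] = bool(params[key])
    return flags


# Example: the client sends its usual hyperparameters plus one flag.
flags = read_acceleration_flags('{"lr": 0.01, "use_ipex": true}')
print(flags)  # {'use_ipex': True, 'use_oneDAL': False}
```

With this approach the user toggles acceleration from the client request, and model.py stays untouched.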
model_obj = AVAILABLE_MODEL_CLASSES[model_class_name]()

if model_class_name == "KMeansSkLearn":
I don't like this conditional logic. The `execute` method should be generic and wrap all possible implementations. Having any model-specific code goes against this idea. Please move those implementation-specific code blocks into their respective classes.
You define the interface and convention. You can make the `fit` or `infer` method return not only results but also metrics that can be used directly to create a pyovms.Tensor object, like:
trained_model, training_metrics = model_obj.fit(X, y, params)
...
return [Tensor("pipeline_output", training_metrics)]
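The convention described in this comment could be sketched roughly as follows. The two toy models (`MajorityLabel`, `SingleCentroid`) and the metric names are hypothetical stand-ins written in plain Python, not the PR's PyTorch and scikit-learn classes. To keep the sketch runnable outside OVMS, `execute` returns the encoded payload instead of building the `Tensor` itself.

```python
import json
from abc import ABC, abstractmethod
from collections import Counter


class ModelClass(ABC):
    """Assumed interface: fit() always returns (trained_model, metrics)."""

    @abstractmethod
    def fit(self, X, y, params):
        ...


class MajorityLabel(ModelClass):
    """Toy supervised model: predicts the most frequent training label."""

    def fit(self, X, y, params):
        label, count = Counter(y).most_common(1)[0]
        return label, {"accuracy": count / len(y)}


class SingleCentroid(ModelClass):
    """Toy unsupervised model: one centroid; labels are ignored."""

    def fit(self, X, y, params):
        centroid = [sum(col) / len(X) for col in zip(*X)]
        inertia = sum((v - c) ** 2 for row in X for v, c in zip(row, centroid))
        return centroid, {"inertia": inertia}


AVAILABLE_MODEL_CLASSES = {
    "MajorityLabel": MajorityLabel,
    "SingleCentroid": SingleCentroid,
}


def execute(model_class_name, X, y, params):
    """Generic entry point: no branching on the model type."""
    model_obj = AVAILABLE_MODEL_CLASSES[model_class_name]()
    trained_model, training_metrics = model_obj.fit(X, y, params)
    # In the OVMS Python node, this payload would be wrapped as
    # Tensor("pipeline_output", payload), as in the snippet above.
    return json.dumps(training_metrics).encode("utf-8")


print(execute("MajorityLabel", [[1.0], [2.0], [3.0]], [0, 0, 1], {}))
print(execute("SingleCentroid", [[0.0], [2.0]], None, {}))
```

Because each class decides which metrics to report, adding a new model only requires a new entry in `AVAILABLE_MODEL_CLASSES`; `execute` itself does not change.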
🛠 Summary
GSoC contribution for the project: Optimized serving endpoint for classical Machine Learning models.
Current status: Finished working on Logistic Regression and KMeans.
Mentors:
Zeglarski, Milosz
Trawinski, Dariusz